Goto

Collaborating Authors

 vis-instruct4v llav a-rlhf llava-next minigpt



RUST: A Comprehensive Benchmark Towards Trustworthy Multimodal Large Language Models

Neural Information Processing Systems

To perform systematic evaluations, we set up 32 various tasks, including improvements to existing multimodal tasks, extension of text-only tasks to multimodal scenarios, and novel methods for risk assessment, which focus on models' basic performance with practical significance.